EFFICIENT ESTIMATION FOR A SUBCLASS OF SHAPE
INVARIANT MODELS 

By Myriam Vimond 
CREST-ENSAI, IRMAR 

In this paper, we observe a fixed number of unknown $2\pi$-periodic
functions differing from each other by both phases and amplitude. 
This semiparametric model appears in literature under the name 
"shape invariant model." While the common shape is unknown, we in- 
troduce an asymptotically efficient estimator of the finite-dimensional 
parameter (phases and amplitude) using the profile likelihood and the 
Fourier basis. Moreover, this estimation method leads to a consistent 
and asymptotically linear estimator for the common shape. 

1. Introduction. In many studies, the response of interest is not a ran- 
dom variable but a noisy function for each experimental unit, resulting in 
a sample of curves. In such studies, it is often adequate to assume that the 
data $Y_{ij}$, the $i$th observation on the $j$th experimental unit, satisfies the
regression model 

(1.1) $Y_{ij} = f_j^*(t_{ij}) + \sigma_j^* \varepsilon_{ij}, \quad i = 1, \dots, n_j,\ j = 1, \dots, J.$

Here, the unknown regression functions $f_j^*$ are $2\pi$-periodic and may depend
nonlinearly on the known regressors $t_{ij} \in [0, 2\pi]$. The unknown error terms
$\sigma_j^* \varepsilon_{ij}$ are independent zero mean random variables with variance $\sigma_j^{*2}$.

The sample of individual regression curves will show a certain homogene- 
ity in structure, in the sense that curves coincide if they are properly scaled 
and shifted. In other words, the structure would be represented by the non- 
linear mathematical model 

(1.2) $f_j^*(t) = a_j^* f^*(t - \theta_j^*) + v_j^* \quad \forall t \in \mathbb{R},\ \forall j = 1, \dots, J,$

where the shift $\theta^* = (\theta_j^*)_{j=1,\dots,J}$, the scale $a^* = (a_j^*)_{j=1,\dots,J}$ and the level
$v^* = (v_j^*)_{j=1,\dots,J}$ are vectors of $\mathbb{R}^J$ and the function $f^*$ is $2\pi$-periodic. This
semiparametric model was introduced by Lawton, Sylvestre and Maggio [7]
under the name of shape invariant model. We have both a finite-dimensional 
parameter $(\theta^*, a^*, v^*)$ and an infinite-dimensional nuisance parameter $f^*$
which is a member of some given large set of functions. A general feature
of semiparametric methods is to "eliminate" the nonparametric component
$f^*$, thus reducing the original semiparametric problem to a suitably chosen
parametric one. 

Such models have been used to study child growth curves (see [6]) or to 
improve a forecasting methodology [8] based on speed data of vehicles on 
a main trunk road (see [2] for more details). Since the common shape is 
assumed to be periodic, the model is particularly well adapted for the study 
of circadian rhythms (see [15]). Our model and our estimation method are 
illustrated with the daily temperature of several cities. 

The main goal of this paper is to present a method for the efficient
estimation of the parameter $(\theta^*, a^*, v^*)$ without knowing $f^*$. The
estimation of parameters for the shape invariant model has been studied by
several authors. First, Lawton, Sylvestre and Maggio [7] proposed an empirical
procedure, SEMOR, based on polynomial approximation of the common
shape $f^*$ on a compact set. The convergence and the consistency of
SEMOR were proved by Kneip and Gasser [6]. Härdle and Marron [5] built
a $\sqrt{n}$-consistent and asymptotically normal estimator using
a kernel estimator for the function $f^*$. Similar to Guardabasso, Rodbard
and Munson [4], Wang and Brown [15] and Luan and Li [9] used a
smoothing spline for the estimation of $f^*$. The method of Gamboa, Loubes and
Maza [2] provides a $\sqrt{n}$-consistent and asymptotically normal
estimator for the shift parameter $\theta^*$. This procedure is based on the discrete
Fourier transform of data. Our estimation method is related to the method
of Gamboa, Loubes and Maza [2]: the common shape $f^*$ is approximated
by trigonometric polynomials.

The efficiency of the estimators is to be understood as asymptotic unbi- 
asedness and minimum variance. To avoid the phenomena of super-efficiency 
(e.g., Hodges estimators), the efficiency is studied in a local asymptotic sense, 
under the local asymptotic normality (LAN) structure. The usual approach 
for determining the efficiency is to specify a least favorable parametric sub- 
model of the full semiparametric model (it is a submodel for which the Fisher 
information is the smallest), locally in a neighborhood of $f^*$, and to estimate
$(\theta^*, a^*, v^*)$ in such a model (see [12, 13]). Here, we consider the parametric
submodel where $f^*$ is a trigonometric polynomial. The method which
is used is close to the procedure of Gassiat and Levy-Leduc [3] where the 
authors estimate efficiently the period of an unknown periodic function. The 
profile log-likelihood is used in order to "eliminate" the nuisance parameter 
and to build an M-estimation criterion. Moreover the efficiency of the
M-estimator of $(\theta^*, a^*, v^*)$ is proved by using the theory developed by McNeney
and Wellner [10]: The authors develop tools for nonindependent identically
distributed data that are similar in spirit to those for independent identically
distributed data. Thus the notions of tangent space and of differentiability
of the parameter $(\theta^*, a^*, v^*)$ are used in order to specify the characteristics
of an efficient estimator. Under the assumptions listed in Theorem 3.1, the
estimator of $(\theta^*, a^*, v^*)$ is asymptotically efficient. This follows the conclusions
of Murphy and Van der Vaart [11]: Semiparametric profile likelihoods,
where the nuisance parameter has been profiled out, behave like ordinary 
likelihoods in that they have a quadratic expansion. 

The profile log-likelihood induces the definition of an estimator for the 
common shape. Corollary 3.1 establishes the consistency of this estimator. 
The rate of the regression function estimator is the optimal rate in nonpara- 
metric estimation [12], Chapter 24. Using the theory developed by McNeney 
and Wellner [10], we discuss its efficiency: the estimator is asymptotically 
linear. But the Fourier coefficients' estimators are efficient if and only if the 
common shape $f^*$ is odd or even. Even if this condition is satisfied, we cannot
deduce that the estimator of $f^*$ is efficient because it is not regular.

This work is related to [14], Chapter 3, where we propose another criterion 
which allows us to estimate efficiently the parameter $(\theta^*, a^*, v^*)$. This criterion,
which is similar by its definition to the criterion proposed by Gamboa,
Loubes and Maza [2] and [14], Chapter 2, allows us to build a test procedure 
for the model. 

The rest of the paper is organized as follows: Section 2 describes the 
model and the estimation method. In Section 3, we discuss the efficiency of 
the estimator. All technical lemmas and proofs are in Section 4. 

2. The estimation method. 

The description of the model. The data $(Y_{ij})$ are the observations of $J$
curves at the observation times $(t_{ij})$. We assume that each curve is observed
at the same set of equidistant points

$t_i = t_{ij} = 2\pi \frac{i-1}{n} \in [0, 2\pi[, \quad i = 1, \dots, n.$

The choice of the observation times $t_i$ is related to the choice of the quadrature
formula (see Remark 2.1). The studied model is

(2.1) $Y_{ij} = a_j^* f^*(t_i - \theta_j^*) + v_j^* + \sigma^* \varepsilon_{ij}, \quad j = 1, \dots, J,\ i = 1, \dots, n.$

The common shape $f^*$ is an unknown real $2\pi$-periodic continuous function.
We denote by $\mathcal{F}$ the set of $2\pi$-periodic continuous functions. The noises
$(\varepsilon_{ij})$ are independent standard Gaussian random variables. For the sake of
simplicity, we assume a common variance $\sigma_j^{*2} = \sigma^{*2}$, $j = 1, \dots, J$. However, all
our results are still valid for a general variance.

The model is semiparametric: $\alpha^* = (\theta^*, a^*, v^*, \sigma^*)$ is the finite-dimensional
parameter and $f^*$ is the nuisance parameter. Our aim is to estimate efficiently
the internal shift $\theta^* = (\theta_j^*)_{j=1,\dots,J}$, the scale parameter $a^* = (a_j^*)_{j=1,\dots,J}$
and the external shift $v^* = (v_j^*)_{j=1,\dots,J}$ without knowing either the shape $f^*$
or the noise level $\sigma^*$. We denote by $\mathcal{A} = [0, 2\pi]^J \times \mathbb{R}^J \times [-v_{\max}, v_{\max}]^J$ the
set where the parameter $(\theta^*, a^*, v^*)$ lies.

The identifiability constraints. Before considering the estimation of pa- 
rameters, we have to study the uniqueness of their definition. Indeed, the 
shape invariant model has some inherent unidentifiability: for a given parameter
$(\theta_0, a_0, v_0) \in \mathbb{R}^3$ and a shape function $f_0$ we can always find another
parameter $(\theta_1, a_1, v_1) \in \mathbb{R}^3$ and another shape function $f_1$ such that
$a_0 f_0(t - \theta_0) + v_0 = a_1 f_1(t - \theta_1) + v_1$ holds for all $t$.

Then we assume that the true parameters lie in the following spaces: 

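$f^* \in \mathcal{F}_0 = \left\{ f \in \mathcal{F},\ c_0(f) = \int_0^{2\pi} f(t) \frac{dt}{2\pi} = 0 \right\}$ and $(\theta^*, a^*, v^*) \in \mathcal{A}_0$,

where $\mathcal{A}_0 = \left\{ (\theta, a, v) \in \mathcal{A},\ \theta_1 = 0,\ \sum_{j=1}^J a_j^2 = J \text{ and } a_1 > 0 \right\}$.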

The constraint on the common shape allows us to uniquely define the parameter
$v^*$ [$v_j^* = c_0(f_j^*)$, $j = 1, \dots, J$] and to build asymptotically independent
estimators (see Remark 3.1). The constant $v_{\max}$ is a user-defined (strictly
positive) parameter which reflects our prior knowledge on the level parameter.
The constraints $\theta_1 = 0$ and $a_1 > 0$ mean that the first unit ($j = 1$) is
taken as "reference" to estimate the shift parameter and the scale parameter.
Finally, the constraint $\sum_{j=1}^J a_j^2 = J$ means that the common shape is
defined as the weighted sum of the regression functions $f_j^*$ (1.1). This condition
is well adapted to our estimation criterion (see the next paragraph
on the profile likelihood). 
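
The role of these constraints can be illustrated numerically. The following is a minimal sketch, not part of the original paper: the helper `normalize`, the test shape `f` and the numerical values are hypothetical, and the mean `c0` of the shape is assumed to be known. It maps an arbitrary parametrization of the curves to an equivalent one satisfying $\theta_1 = 0$, $\sum_j a_j^2 = J$, $a_1 > 0$ and $c_0(f) = 0$, and checks that the regression curves are unchanged.

```python
import math

def normalize(theta, a, v, f, c0):
    """Return an equivalent parametrization (theta', a', v', f') with
    theta'_1 = 0, sum_j a'_j^2 = J, a'_1 > 0 and a centered shape f'.
    c0 is the mean of f over one period (assumed known in this sketch)."""
    J = len(a)
    # Signed scale factor: fixes sum a_j^2 = J and the sign of a_1.
    r = math.copysign(math.sqrt(sum(x * x for x in a) / J), a[0])
    shift = theta[0]
    new_theta = [(th - shift) % (2 * math.pi) for th in theta]
    new_a = [x / r for x in a]
    # Moving the mean of f into the level parameters.
    new_v = [vj + aj * c0 for vj, aj in zip(v, a)]

    def new_f(t):
        return r * (f(t - shift) - c0)

    return new_theta, new_a, new_v, new_f

def curve(j, t, theta, a, v, f):
    """Regression function f_j(t) = a_j f(t - theta_j) + v_j."""
    return a[j] * f(t - theta[j]) + v[j]

# Hypothetical example: a 2*pi-periodic shape with mean 2.
f = lambda t: 2.0 + math.cos(t) + 0.5 * math.sin(2 * t)
theta, a, v = [0.3, 1.1], [-2.0, 0.5], [1.0, -1.0]
theta0, a0, v0, f0 = normalize(theta, a, v, f, c0=2.0)
```

Both parametrizations describe the same curves, which is exactly the unidentifiability that the constraints remove.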

The profile log-likelihood. Maximizing the likelihood function directly is 
not possible for higher-dimensional parameters, and fails particularly for 
semiparametric models. Frequently, this problem is overcome by using a 
profile likelihood rather than a full likelihood. If $l_n(\alpha, f)$ is the full
log-likelihood, then the profile likelihood for $\alpha \in \mathcal{A}_0 \times \mathbb{R}_+^*$ is defined as

$pl_n(\alpha) = \sup_{f \in \mathcal{F}_0} l_n(\alpha, f).$

The maximum likelihood estimator for $\alpha$, the first component of the pair
$(\hat\alpha_n, \hat f_n)$ that maximizes $l_n(\alpha, f)$, is the maximizer of the profile likelihood
function $\alpha \mapsto pl_n(\alpha)$. Thus we maximize the likelihood in two steps. With
the assumptions on the model, we shall use the Gaussian log-likelihood,

(2.2) $l_n(\alpha, f) = -\frac{1}{2\sigma^2} \sum_{i=1}^n \sum_{j=1}^J \left( Y_{ij} - a_j f(t_i - \theta_j) - v_j \right)^2 - \frac{nJ}{2} \log(\sigma^2).$

Generally, the problem of minimization on a large set is solved by the
consideration of a parametric subset. Here, the semiparametric problem is
reduced to a parametric one: $f$ is approximated by its truncated Fourier
series. Thus the profile likelihood is approximated by maximizing the
likelihood $l_n$ on a subset of trigonometric polynomials. More precisely, let $(m_n)_n$
be an increasing sequence of integers, and let $\mathcal{F}_{0,n}$ be the subspace of $\mathcal{F}_0$ of
trigonometric polynomials whose degree is less than $m_n$. In order to preserve
the orthogonality of the discrete Fourier basis,

$\forall |l| \le m_n,\ \forall |p| \le m_n, \quad \frac{1}{n} \sum_{r=1}^n e^{i(l-p) t_r} = \begin{cases} 1, & \text{if } l = p, \\ 0, & \text{if } l \ne p, \end{cases}$

we choose $m_n$ and $n$ such that

(2.3) $2|m_n| < n, \quad \lim_{n \to +\infty} m_n = +\infty \quad \text{and } n \text{ is odd}.$
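
The orthogonality condition can be checked directly on the equidistant design. The sketch below is an illustration only (the function name `discrete_inner` and the values `n = 11`, `m = 5` are hypothetical choices satisfying $2m < n$): it evaluates the discrete inner product of two Fourier basis functions and also shows the aliasing that appears when the frequency gap reaches $n$.

```python
import cmath
import math

def discrete_inner(l, p, n):
    """(1/n) * sum_{r=1}^{n} exp(i*(l-p)*t_r) with t_r = 2*pi*(r-1)/n."""
    return sum(cmath.exp(1j * (l - p) * 2 * math.pi * k / n) for k in range(n)) / n

n, m = 11, 5  # hypothetical sizes with 2*m < n and n odd

# Orthonormality holds for all frequencies |l|, |p| <= m.
gram_ok = all(
    abs(discrete_inner(l, p, n) - (1.0 if l == p else 0.0)) < 1e-12
    for l in range(-m, m + 1)
    for p in range(-m, m + 1)
)

# Frequencies 6 and -5 differ by n = 11: on the grid they are indistinguishable.
aliased = discrete_inner(6, -5, n)
```

The second computation shows why the degree must stay below $n/2$: two frequencies whose difference is a multiple of $n$ coincide on the observation grid.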



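After some computations, the likelihood maximum is reached in the space
$\mathcal{F}_{0,n}$ by the trigonometric polynomial

(2.4) $\hat f_\alpha(t) = \sum_{1 \le |l| \le m_n} \hat c_l(\alpha) e^{ilt} \quad \forall t \in \mathbb{R},$

where for $l \in \mathbb{Z}$, $1 \le |l| \le m_n$,

(2.5) $\hat c_l(\alpha) = \left( n \sum_{j=1}^J a_j^2 \right)^{-1} \sum_{j=1}^J a_j \sum_{i=1}^n (Y_{ij} - v_j) e^{-il(t_i - \theta_j)} \quad \forall \alpha \in \mathcal{A}_0 \times \mathbb{R}_+^*.$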

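Finally, using the orthogonality of the discrete Fourier basis, the following
equality holds:

$\sum_{j=1}^J \sum_{i=1}^n \left( Y_{ij} - v_j - a_j \sum_{1 \le |l| \le m_n} \hat c_l(\alpha) e^{il(t_i - \theta_j)} \right)^2$
$\quad = \sum_{j=1}^J \sum_{i=1}^n (Y_{ij} - v_j)^2 - n \left( \sum_{j=1}^J a_j^2 \right) \sum_{1 \le |l| \le m_n} |\hat c_l(\alpha)|^2$
$\qquad + n \sum_{1 \le |l|, |p| \le m_n,\ l \ne p} \hat c_l(\alpha) \overline{\hat c_p(\alpha)}\, \varphi_n\!\left( \frac{l-p}{n} \right) \sum_{j=1}^J a_j^2 e^{-i(l-p)\theta_j},$

where $\varphi_n(t) = \sum_{s=1}^n e^{2i\pi s t}/n$. Let $M_n$ be the function of $\beta = (\theta, a, v)$ defined
as

$M_n(\beta) = \frac{1}{nJ} \sum_{j=1}^J \sum_{i=1}^n (Y_{ij} - v_j)^2 - \sum_{1 \le |l| \le m_n} |\hat c_l(\alpha)|^2.$

With the identifiability constraints of the model, the profile log-likelihood
$pl_n$ is equal to

(2.6) $pl_n(\alpha) = -\frac{nJ}{2\sigma^2} M_n(\beta) - \frac{nJ}{2} \log(\sigma^2), \quad \alpha = (\beta, \sigma).$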
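
The coefficients (2.5) and the criterion $M_n$ are straightforward to evaluate. The following is a minimal sketch under stated assumptions, not the author's implementation: the function names `design`, `c_hat` and `M_n`, the test shape `f_star` and all numerical values are hypothetical, and the data are generated without noise so that the behavior of the criterion at the true parameter is exact.

```python
import cmath
import math

def design(n):
    """Equidistant observation times t_i = 2*pi*(i-1)/n, i = 1..n."""
    return [2 * math.pi * i / n for i in range(n)]

def c_hat(l, theta, a, v, Y, t):
    """Estimated Fourier coefficient (2.5); Y[j][i] is curve j at time t[i]."""
    n, J = len(t), len(a)
    s = 0j
    for j in range(J):
        for i in range(n):
            s += a[j] * (Y[j][i] - v[j]) * cmath.exp(-1j * l * (t[i] - theta[j]))
    return s / (n * sum(x * x for x in a))

def M_n(theta, a, v, Y, t, m):
    """Criterion M_n: mean residual energy minus energy of the fitted shape."""
    n, J = len(t), len(a)
    rss = sum((Y[j][i] - v[j]) ** 2 for j in range(J) for i in range(n)) / (n * J)
    energy = sum(abs(c_hat(l, theta, a, v, Y, t)) ** 2
                 for l in range(-m, m + 1) if l != 0)
    return rss - energy

# Hypothetical noiseless example with J = 2 curves and a centered shape.
f_star = lambda u: math.cos(u) + 0.5 * math.sin(2 * u)
theta_s = [0.0, 0.8]
a_s = [0.75, math.sqrt(2 - 0.75 ** 2)]   # satisfies a_1^2 + a_2^2 = J = 2
v_s = [2.5, 0.5]
n, m = 21, 3                              # n odd and 2*m < n
t = design(n)
Y = [[a_s[j] * f_star(ti - theta_s[j]) + v_s[j] for ti in t] for j in range(2)]

at_truth = M_n(theta_s, a_s, v_s, Y, t, m)
off_truth = M_n([0.0, 0.3], a_s, v_s, Y, t, m)
```

In this noiseless setting the criterion vanishes at the true parameter and is strictly positive when the shift of the second curve is wrong; with noise, the value at the truth is close to $\sigma^{*2}$ instead.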

Remark 2.1. The estimation method requires the estimation of the 
Fourier coefficients of the common shape. A natural approach for estimat- 
ing an integral is to use a quadrature formula which is associated with the 
observation times $t_i$. In this paper, the observation times are equidistant.
Therefore the quadrature formula is the well-known Newton-Cotes formula. 
Even if another choice of the observation times is possible (see [14], Chapter 2),
this formula defines the discrete Fourier coefficients $c_l^n(f)$ which are
an accurate approximation of $c_l(f)$:

$c_l^n(f^*) = \frac{1}{n} \sum_{s=1}^n f^*(t_s) e^{-il t_s} \approx c_l(f^*) = \int_0^{2\pi} f^*(t) e^{-ilt} \frac{dt}{2\pi}.$

Moreover, the stochastic part of the coefficients (2.5) is a linear combination
of the complex variables $w_{j,l}$,

$w_{j,l} = \frac{1}{n} \sum_{r=1}^n e^{-il t_r} \varepsilon_{rj}, \quad j = 1, \dots, J,\ |l| \le m_n.$

Due to Cochran's theorem, these variables are independent centered complex 
Gaussian variables whose variance is equal to 1/n. This property is
related to the convergence rate of the estimators (see [14], Chapter 2, for 
more details, and [3] to compare). 
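
The quality of the quadrature approximation can be seen on a simple example. The sketch below is illustrative only (the function `discrete_coef` and the test function `f` are hypothetical): for a trigonometric polynomial the discrete coefficients are exact as long as $n$ exceeds twice the degree, whereas a grid that is too coarse produces aliasing.

```python
import cmath
import math

def discrete_coef(f, l, n):
    """c_l^n(f) = (1/n) * sum_s f(t_s) * exp(-i*l*t_s), t_s = 2*pi*(s-1)/n."""
    return sum(
        f(2 * math.pi * s / n) * cmath.exp(-1j * l * 2 * math.pi * s / n)
        for s in range(n)
    ) / n

# Hypothetical test function: c_2(f) = 0.5, c_5(f) = -0.15i, c_{-5}(f) = +0.15i.
f = lambda u: math.cos(2 * u) + 0.3 * math.sin(5 * u)

fine = discrete_coef(f, 2, 21)    # n = 21 > 2 * 5: no aliasing
coarse = discrete_coef(f, 2, 7)   # n = 7: frequency -5 folds onto frequency 2
```

On the coarse grid the coefficient of frequency $-5 = 2 - 7$ is added to that of frequency 2, which is the discrete counterpart of the remainder term controlled in the proofs.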

The estimation procedure. Consequently, the maximum likelihood estimator
of the finite-dimensional parameter is defined as

$\hat\beta_n = \arg\min_{\beta \in \mathcal{A}_0} M_n(\beta) \quad \text{or} \quad \hat\alpha_n = (\hat\beta_n, \hat\sigma_n) = \arg\max_{\alpha \in \mathcal{A}_0 \times \mathbb{R}_+^*} pl_n(\alpha).$

Then, the estimator of the common shape is the trigonometric polynomial
which maximizes the likelihood when $\alpha = \hat\alpha_n$:

$\hat f_n(t) = \hat f_{\hat\alpha_n}(t) = \sum_{1 \le |l| \le m_n} \hat c_l(\hat\alpha_n) e^{ilt} \quad \forall t \in \mathbb{R}.$

First, we study the consistency of the estimator of $(\theta^*, a^*, v^*)$. The consistency
of the common shape estimator is studied in the next section.

Theorem 2.1 (Consistency). Assume that $2\pi$ is the minimal period of
$f^*$, and that




Then $\hat\alpha_n$ converges in probability to $\alpha^*$.



The assumption regarding the common shape means that the function $f^*$
is a 1/2-Hölder function. The assumption on the number of Fourier coefficients
means that $m_n$ has to be small in relation to the number of observations
$n$. Notice that Theorem 2.1 is still valid if the noises $(\varepsilon_{ij})$ are (centered)
independent identically distributed with finite variance.

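Proof of Theorem 2.1. The proof of this theorem follows the classical
guidelines of the convergence of M-estimators (see, e.g., Theorem 5.7 of Van
der Vaart [12]). Indeed, to ensure consistency of $\hat\beta_n$, it suffices to show that:

(i) The uniform convergence of $M_n$ to a contrast function $M + \sigma^{*2}$ (Lemma 4.1):

$\sup_{\beta \in \mathcal{A}_0} |M_n(\beta) - M(\beta) - \sigma^{*2}| = o_P(1),$

where $M$ is defined as

$M(\beta) = \int_0^{2\pi} \frac{1}{J} \sum_{j=1}^J \left( f_j^*(t) - v_j \right)^2 \frac{dt}{2\pi} - \int_0^{2\pi} \left( \frac{1}{J} \sum_{j=1}^J a_j \left( f_j^*(t + \theta_j) - v_j^* \right) \right)^2 \frac{dt}{2\pi};$

(ii) $M(\cdot)$ has a unique minimum at $\beta^*$ (Lemma 4.2). □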

The daily temperatures of cities. The estimation method is applied to 
daily average temperatures (the average daily temperatures are the average 
of 24 hourly temperature readings). The data come from the University of
Dayton (http://www.engr.udayton.edu/weather/). In order to illustrate 
the method, we limit the study to three cities which have a temperature 
range of an oceanic climate: Juneau (Alaska, city j = 1), Auckland (New 
Zealand, city j = 2) and Bilbao (Spain, city j = 3). An oceanic climate is 
the climate typically found along the west coasts at the middle latitudes of 
all the world's continents, and in southeastern Australia. Similar climates are 
also found on coastal tropical highlands and tropical coasts on the leeward 
sides of mountain ranges. Figure 1(a) plots the sample of temperature curves. 

If we assume that the data fit the model (2.1), the parameters $\theta^*$, $a^*$ and
$v^*$ have the following meanings:

• $v_j^*$ is the annual temperature average of the $j$th city,

• $a_j^*$ indicates whether the city is in the same hemisphere as the first city
($a_j^* > 0$) and measures the differences between the winter and summer
temperatures,

• $\theta_j^*$ is the seasonal phase of the $j$th city,

• $f^*$ describes the general behavior of the temperature evolution of the
oceanic climate.

Table 1
Estimators of the parameters $\theta_2^*$, $\theta_3^*$, $a_1^*$, $a_2^*$, $v_1^*$, $v_2^*$ and $v_3^*$

City j                          j = 1      j = 2      j = 3
$\hat\theta_{j,n}$ (days)                  12.5182    25.35381
$\hat a_{j,n}$                  1.2421     -0.5833    1.0569
$\hat v_{j,n}$ (Fahrenheit)     43.9874    58.5312    60.1814

The estimators of these parameters are given in Table 1. 

Figure 1(b) plots the estimator of the common shape. The number of
Fourier coefficients used to estimate the common shape is $m_n = 5$. Further
study would determine the most accurate number $m_n$; this leads to studying the
estimation problem from the point of view of model selection.

3. Efficient estimation. 

3.1. The LAN property. Before studying the asymptotic efficiency of the 
estimators, we have to establish the local asymptotic normality of the model. 
First, let us introduce some notation. The model is semiparametric. The
finite-dimensional parameter $\alpha^*$ lies in $\mathcal{A}_0 \times \mathbb{R}_+^*$. The nuisance parameter
$f^*$ lies in $\mathcal{F}_0$. For $(\alpha, f) \in \mathcal{A}_0 \times \mathbb{R}_+^* \times \mathcal{F}_0$ and $t \in \mathbb{R}$, we denote by $P_{\alpha,f}(t)$
the Gaussian distribution in $\mathbb{R}^J$ with variance $\sigma^2 I_J$ and mean
$(a_j f(t - \theta_j) + v_j)_{j=1,\dots,J}$. Then the model of the observations is

$\mathcal{P}_n = \left\{ P^{(n)}_{\alpha,f} = \bigotimes_{i=1}^n P_{\alpha,f}(t_i),\ (\alpha, f) \in \mathcal{A}_0 \times \mathbb{R}_+^* \times \mathcal{F}_0 \right\}.$

Fig. 1. (a) Plots of the temperature curves associated with Juneau (Alaska), Auckland
(New Zealand) and Bilbao (Spain) in 2004. (b) Plot of the estimator of the common shape.

To avoid the phenomenon of super-efficiency, we study the model on a
local neighborhood of $(\alpha^*, f^*)$. Let $(\alpha_n(h), f_n(h))$ be close to $(\alpha^*, f^*)$ in
the direction $h$. The LAN property requires that the log-likelihood ratio
for the two points $(\alpha^*, f^*)$ and $(\alpha_n(h), f_n(h))$ converges in distribution to a
Gaussian variable which depends only on $h$.

Since the observations of our model are not identically distributed, we
shall follow the semiparametric analysis developed by McNeney and Wellner
[10]. The LAN property allows identification of the least favorable direction $h$
that approaches the model, and thus allows us to know whether the estimator
is efficient. Let us denote the log-likelihood ratio for the two points $(\alpha^*, f^*)$
and $(\alpha, f)$ by

$\Lambda_n(\alpha, f) = \log \frac{dP^{(n)}_{\alpha,f}}{dP^{(n)}_{\alpha^*,f^*}}.$

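Proposition 3.1 (LAN property). Assume that the function $f^*$ is not
constant and is differentiable with a continuous derivative denoted by $\partial f^*$.
Assume that the reals $a_j^*$, $j = 1, \dots, J$, are nonnull. Considering the vector
space $\mathcal{H} = \mathbb{R}^{J-1} \times \mathbb{R}^{J-1} \times \mathbb{R}^J \times \mathbb{R} \times \mathcal{F}_0$, the coordinates of a vector $h \in \mathcal{H}$
are denoted as follows:

$h = (h_{\theta,2}, \dots, h_{\theta,J}, h_{a,2}, \dots, h_{a,J}, h_{v,1}, \dots, h_{v,J}, h_\sigma, h_f).$

Then the space $\mathcal{H}$ is an inner-product space endowed with the inner product
$\langle \cdot, \cdot \rangle$,

$\langle h, h' \rangle = \frac{2J}{\sigma^{*2}} h_\sigma h'_\sigma$
$\quad + \frac{1}{\sigma^{*2}} \left\langle h_{v,1} + a_1^* h_f - \sum_{j=2}^J \frac{a_j^*}{a_1^*} h_{a,j} f^*,\ h'_{v,1} + a_1^* h'_f - \sum_{j=2}^J \frac{a_j^*}{a_1^*} h'_{a,j} f^* \right\rangle_{L^2}$
$\quad + \frac{1}{\sigma^{*2}} \sum_{j=2}^J \left\langle h_{v,j} + a_j^* h_f + h_{a,j} f^* - h_{\theta,j} a_j^* \partial f^*,\ h'_{v,j} + a_j^* h'_f + h'_{a,j} f^* - h'_{\theta,j} a_j^* \partial f^* \right\rangle_{L^2},$

where $\langle \cdot, \cdot \rangle_{L^2}$ is the inner product in $L^2[0, 2\pi]$. Moreover, the model (2.1) is
LAN at $(\alpha^*, f^*)$ indexed by the tangent space $\mathcal{H}$. In other words, for each
$h \in \mathcal{H}$, there exists a sequence $(\alpha_n(h), f_n(h))$ such that

$\Lambda_n(\alpha_n(h), f_n(h)) = \Delta_n(h) - \frac{1}{2} \|h\|^2_{\mathcal{H}} + o_P(1).$

Here, the central sequence $\Delta_n(h)$ is linear with $h$,

$\Delta_n(h) = \frac{1}{\sqrt{n}} \sum_{i=1}^n \sum_{j=1}^J \left\{ \frac{h_\sigma}{\sigma^*} (\varepsilon_{ij}^2 - 1) + A_{i,j}(h) \varepsilon_{ij} / \sigma^* \right\},$

where for all $i = 1, \dots, n$,

$A_{i,j}(h) = \begin{cases} a_1^* h_f(t_i) - \sum_{k=2}^J \frac{a_k^*}{a_1^*} h_{a,k} f^*(t_i) + h_{v,1}, & \text{if } j = 1, \\ a_j^* h_f(t_i - \theta_j^*) + h_{a,j} f^*(t_i - \theta_j^*) - h_{\theta,j} a_j^* \partial f^*(t_i - \theta_j^*) + h_{v,j}, & \text{if } j = 2, \dots, J. \end{cases}$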

Notice that for the independent identically distributed semiparametric 
models, the fact that the tangent space would not be complete does not 
imply the existence of a least favorable direction. In our model the tangent
space $\mathcal{H}$ is a subset of the Hilbert space

$H = \mathbb{R}^{J-1} \times \mathbb{R}^{J-1} \times \mathbb{R}^J \times \mathbb{R} \times \{ f \in L^2[0, 2\pi],\ c_0(f) = 0 \},$

endowed with the inner product $\langle \cdot, \cdot \rangle$. Consequently, it is easier to determine
the least favorable direction using the Riesz representation theorem. 

3.2. The efficiency. The goal of this paper may be stated as the semiparametric
efficient estimation of the parameter
$\nu_n(P^{(n)}_{\alpha^*,f^*}) = (\theta_2^*, \dots, \theta_J^*, a_2^*, \dots, a_J^*, v_1^*, \dots, v_J^*)$.
This parameter is differentiable relative to the tangent space $\mathcal{H}$,

$\lim_{n \to \infty} \sqrt{n} \left( \nu_n(P^{(n)}_{\alpha_n(h), f_n(h)}) - \nu_n(P^{(n)}_{\alpha^*,f^*}) \right) = (h_{\theta,2}, \dots, h_{\theta,J}, h_{a,2}, \dots, h_{a,J}, h_{v,1}, \dots, h_{v,J}).$

Consequently, there exists a continuous linear map $\dot\nu$ from $\mathcal{H}$ onto
$\mathbb{R}^{3J-2}$. According to the Riesz representation theorem, there exist $3J - 2$
vectors $(\dot\nu_j^\theta)_{2 \le j \le J}$, $(\dot\nu_j^a)_{2 \le j \le J}$ and $(\dot\nu_j^v)_{1 \le j \le J}$ of $H$ such that

$\forall h \in H \quad \langle \dot\nu_j^\theta, h \rangle = h_{\theta,j}, \quad \langle \dot\nu_j^a, h \rangle = h_{a,j} \quad \text{and} \quad \langle \dot\nu_j^v, h \rangle = h_{v,j}.$

These vectors are defined in Lemma 4.3. Using the linearity with $h$ of $\Delta_n(h)$,
the following proposition, which is an application of Proposition 5.3 of McNeney
and Wellner [10], links the notion of asymptotic linearity of an estimator
and the efficiency.

Proposition 3.2 (Asymptotic linearity and efficiency). Let $T_n$ be an
asymptotically linear estimator of $\nu_n(P^{(n)}_{\alpha^*,f^*})$ with the central sequence

$(\Delta_n(h_2^\theta), \dots, \Delta_n(h_J^\theta), \Delta_n(h_2^a), \dots, \Delta_n(h_J^a), \Delta_n(h_1^v), \dots, \Delta_n(h_J^v)).$

$T_n$ is regular efficient if and only if for all $j$, $h_j^\theta = \dot\nu_j^\theta$, $h_j^a = \dot\nu_j^a$ and $h_j^v = \dot\nu_j^v$.

From Lemma 4.3, if the assumptions of Proposition 3.1 hold and if the
estimator $\hat\beta_n = (\hat\theta_n, \hat a_n, \hat v_n)$ is asymptotically linear, it is efficient if and only
if



V^(e n - 9*) 

\fn(a n - a*) 



a 



E 



\\df*U 

—Y 

II f*IU 



i=i 



n ; 



i=l 



J 



-D- 1 dF*(ti)ei,. + op(l), 

Ij-i-jA'A F*(ti)W.+op(l), 



-.4 



Vn(v n - v*) = cr*y^g i; . + op(l) where t e ir = t (e il ,. j), 



i=i 



where D is the diagonal matrix diag(a2, . . . , a%) and A = t (a,2, ■ ■ ■ , uj 
vector in IR^ -1 . F*(t) and dF*(t) are, respectively, the diagonal matrix 
diag(/* x (t - Of), f*(t - 9})) and diag(5/*(t -Of),..., df*(t - 9})) for 
all $t \in \mathbb{R}$. We deduce the following theorem:

Theorem 3.1 (Efficiency). Assume that the assumptions of Proposition 
3.1 hold and that 



a j a 



(3.1) 



(3.2) 



£iin*(r)i<°°, 



m i n /n = o{l). 



Then $(\hat\theta_n, \hat a_n, \hat v_n)$ is asymptotically efficient and $\sqrt{n}(\hat\theta_n - \theta^*, \hat a_n - a^*, \hat v_n - v^*)$
converges in distribution to a Gaussian vector $\mathcal{N}_{3J-2}(0, \sigma^{*2} H^{-1})$, where $H$
is the matrix defined as



H 



( \\df*\\l 2 (D 2 -jA* 
















\ 



11/ 



* l|2 

II 2 



I + —^A A 



IjJ 
and its inverse matrix $H^{-1}$ is equal to



(. 



H 



-l 



||<9/*" 2 



D~ 2 + 






J-i 



II/* 112 



L 2 





/j-l 









J 



-A*A 







Proof. Recall that the M-estimator is defined as the minimum of the
criterion function $M_n(\cdot)$. Hence, we get

$\nabla M_n(\hat\beta_n) = 0,$

where $\nabla$ is the gradient operator. Thanks to a second-order expansion, there
exists $\bar\beta_n$ in a neighborhood of $\beta^*$ such that

$\nabla^2 M_n(\bar\beta_n) \sqrt{n} (\hat\beta_n - \beta^*) = -\sqrt{n} \nabla M_n(\beta^*),$

where $\nabla^2$ is the Hessian operator. Now, using two asymptotic results from
Proposition 4.1 and from Proposition 4.2, we obtain



V^0 n - 9* 

y/n(a n — a* 
y/n(v n - V* 



||9/*" 2 



. J2 



a" 



ll/*P 



L- 

a*G v n + o ¥ (l). 



d ~ 2 + ^h-Af-l ) 



Ij-i - -J A A)G^ + op(l) 



□ 



Remark 3.1. The choice of the identifiability constraints is important
for the relevancy of the estimation. For example, if we no longer assume that
$c_0(f)$ is null, we may consider the following parameter space:

$\mathcal{A}_1 = \left\{ (\theta, a, v) \in \mathcal{A} \text{ such that } \theta_1 = 0,\ v_1 = 0,\ \sum_{j=1}^J a_j^2 = J \text{ and } a_1 > 0 \right\}$ and $f \in \mathcal{F}$.

Consequently we have to estimate $3J - 3$ parameters: $\theta_2^*, \dots, \theta_J^*$, $a_2^*, \dots, a_J^*$
and $v_2^*, \dots, v_J^*$. This choice modifies the estimation criterion and the tangent
space, too. Nevertheless, if the assumptions of Theorem 3.1 hold, the
estimator is asymptotically efficient. But its covariance matrix is not block
diagonal any more:



lis/* I 



D~ 



ll/*l£ 2 -co(/*) 2 

-C0(f) 



B 



II/* co (/*) 2 





-<*(/*) 
llfll^-coC/*) 2 

11/%-coC/*) 2 



\ 



1. 1-1 

B- 1 
Fig. 2. Boxplots of the estimators of $\theta_2^*$, $a_2^*$ and $v_2^*$ associated with the parameter space $\mathcal{A}_0$
(a) and $\mathcal{A}_1$ (b). The data are generated with $f^*(t) = 20\, t/(2\pi)(1 - t/(2\pi))$, $\theta^* = (0\ \ 0.8)$,
$a^* = (0.75\ \ 1.1990)$, $v^* = (7.5/3\ \ 0.5)$ and $n = 201$. The boxplots are computed from 100
sets of data.



where $B = I_{J-1} - \frac{1}{J} A\, {}^t\!A$ with $B^{-1} = I_{J-1} + \frac{1}{a_1^{*2}} A\, {}^t\!A$. In other words, $\hat a_n$ and
$\hat v_n$ are not asymptotically independent: modifying the identifiability constraint
$c_0(f^*) = 0$ damages the quality of the estimation.
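
The inverse of a matrix of the form $I_{J-1} - \frac{1}{J} A\,{}^t\!A$, with $A = {}^t(a_2^*, \dots, a_J^*)$, follows from the Sherman-Morrison formula together with the constraint $\sum_j a_j^{*2} = J$, which gives $J - {}^t\!A A = a_1^{*2}$. The sketch below is only a numerical check of this identity; the helper names are hypothetical and the scale values are the estimates of Table 1, renormalized so that the constraint holds exactly.

```python
import math

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def identity_plus(A, c):
    """Return I + c * A A^T for a vector A given as a list of floats."""
    d = len(A)
    return [[(1.0 if i == j else 0.0) + c * A[i] * A[j] for j in range(d)]
            for i in range(d)]

# Scale estimates from Table 1, renormalized so that sum_j a_j^2 = J exactly.
a = [1.2421, -0.5833, 1.0569]
J = len(a)
r = math.sqrt(sum(x * x for x in a) / J)
a = [x / r for x in a]

A = a[1:]                                   # vector (a_2, ..., a_J)
B = identity_plus(A, -1.0 / J)              # I - (1/J) A A^T
B_inv = identity_plus(A, 1.0 / a[0] ** 2)   # I + (1/a_1^2) A A^T
product = matmul(B, B_inv)                  # should be the identity matrix
```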

To illustrate this phenomenon, we present the boxplots of the estimators
which are respectively associated with the parameter space $\mathcal{A}_0$ [Figure 2(a)]
and $\mathcal{A}_1$ [Figure 2(b)]. Let $(\alpha^*, f^*)$ be a parameter of the model. With the
constraints associated with the parameter space $\mathcal{A}_0$, we have to estimate $\theta_2^*$,
$a_2^*$, $v_1^*$ and $v_2^*$ for the following model ($J = 2$):

$Y_{i,1} = a_1^* f^*(t_i) + v_1^* + \varepsilon_{i,1}, \quad i = 1, \dots, n,$
$Y_{i,2} = a_2^* f^*(t_i - \theta_2^*) + v_2^* + \varepsilon_{i,2}, \quad i = 1, \dots, n.$

With the constraints associated with the parameter space $\mathcal{A}_1$, we have to
estimate $\theta_2^*$, $a_2^*$ and $\tilde v_2^*$. The data may be rewritten as

$Y_{i,1} = a_1^* g^*(t_i) + \varepsilon_{i,1}, \quad i = 1, \dots, n,$
$Y_{i,2} = a_2^* g^*(t_i - \theta_2^*) + \tilde v_2^* + \varepsilon_{i,2}, \quad i = 1, \dots, n,$

where $g^* = f^* + v_1^*$ and $\tilde v_2^* = v_2^* - a_2^* v_1^*$. After generating several sets of data
from a parameter $(\alpha^*, f^*)$ which we have chosen, we have computed the
estimators of $\theta_2^*$, $a_2^*$ and $v_2^*$ for every set of data. Figure 2 presents the boxplots
of the estimators of $\theta_2^*$, $a_2^*$ and $v_2^*$ for these two models.

As a consequence of the previous theorem, the Gaussian vector $G_n$ converges
in distribution to a centered Gaussian vector $\mathcal{N}_{3J-2}(0, H)$, and the
equation holds:

$\sqrt{n} (\hat\beta_n - \beta^*) = (H/\sigma^{*2})^{-1} (G_n/\sigma^*) + o_P(1).$

Comparing this formula with the results of the independent identically distributed
semiparametric model (see [12]), we identify the efficient information
matrix as $H/\sigma^{*2}$ and the efficient score as $G_n/\sigma^*$.

Indeed, let $X_1, \dots, X_n$ be a random sample from a distribution $P$ that is
known to belong to a set of probabilities $\{P_{\theta,\eta},\ \theta \in \Theta \subset \mathbb{R}^d,\ \eta \in \mathcal{H}\}$. Then an
estimator sequence $T_n$ is asymptotically efficient for estimating $\theta$ if

$\sqrt{n} (T_n - \theta) = (\tilde I_{\theta,\eta})^{-1} \left( \frac{1}{\sqrt{n}} \sum_{i=1}^n \tilde l_{\theta,\eta}(X_i) \right) + o_P(1),$

where $\tilde l_{\theta,\eta}$ is the efficient score function, and $\tilde I_{\theta,\eta}$ is the efficient information
matrix.

Moreover, our result follows Murphy and Van der Vaart [11]. The au- 
thors demonstrate that if the entropy of the nuisance parameters is not too 
large and the least favorable direction exists, the profile likelihood behaves 
very much like the ordinary likelihood and the profile likelihood correctly 
selects a least favorable direction for the independent identically distributed 
semiparametric model. This holds if the profile log-likelihood pl n verifies the 
following equation: 

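$pl_n(\hat\theta_n) - pl_n(\theta) = (\hat\theta_n - \theta)^t \sum_{i=1}^n \tilde l_{\theta,\eta}(X_i) - \frac{1}{2} n (\hat\theta_n - \theta)^t \tilde I_{\theta,\eta} (\hat\theta_n - \theta) + o_P(\sqrt{n} \|\hat\theta_n - \theta\| + 1)^2,$

where $\hat\theta_n$ maximizes $pl_n$. Then, if $\tilde I_{\theta,\eta}$ is invertible, and $\hat\theta_n$ is consistent, $\hat\theta_n$
is asymptotically efficient.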

For our model, a similar asymptotic expansion holds. Indeed, by a Taylor 
expansion, there exists a n such that 

pln(a n ) -pl n (a*) 

= n x i 2 G n n -n-^ Kk - n^siPn - n 

Zi (J 

+op(n 1 / 2 ||/3 n -/3*|| + l) 2 . 

3.3. Asymptotic linearity of the common shape estimator. In this subsec- 
tion, we study the consistency and the characteristics of the estimator of the 
common shape which is defined in Section 2. We show that the convergence 
rate of this estimator is the optimal rate for the nonparametric estimation. 

Corollary 3.1. Assume that $f^*$ is $k$ times continuously differentiable
with $\int_0^{2\pi} |f^{*(k)}(t)|^2 dt < \infty$ and $k \ge 1$. Furthermore, suppose that the
assumptions of Theorem 3.1 hold; then there exists a constant $C$ such that for a
large $m_n$

$\int_0^{2\pi} |\hat f_n(t) - f^*(t)|^2 \frac{dt}{2\pi} = O_P\left( \frac{1}{m_n^{2k}} + \frac{m_n}{n} \right),$
$\int_0^{2\pi} E\left( \hat f_n(t) - f^*(t) \right)^2 \frac{dt}{2\pi} \le C \left( \frac{1}{m_n^{2k}} + \frac{m_n}{n} \right).$

Consequently, for $m_n \sim n^{1/(2k+1)}$, we have $\mathrm{MISE}_{f^*}(\hat f_n) = O(n^{-2k/(2k+1)})$.
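
The choice $m_n \sim n^{1/(2k+1)}$ balances a squared-bias term of order $m_n^{-2k}$ against a variance term of order $m_n/n$. The sketch below illustrates this trade-off under the assumption that the bound has the form $C(m^{-2k} + m/n)$; the functions `risk_bound` and `best_m` and the constant `C = 1` are hypothetical, and the code does not select $m_n$ from data.

```python
def risk_bound(m, n, k, C=1.0):
    """Assumed form of the risk bound: squared bias m^(-2k) plus variance m/n."""
    return C * (m ** (-2 * k) + m / n)

def best_m(n, k):
    """Integer m with 2*m < n that minimizes the assumed bound."""
    return min(range(1, n // 2), key=lambda m: risk_bound(m, n, k))

n, k = 1001, 1          # hypothetical sample size (odd) and smoothness
m_opt = best_m(n, k)
# Continuous minimizer of m^(-2k) + m/n, of order n^(1/(2k+1)).
m_theory = (2 * k * n) ** (1.0 / (2 * k + 1))
```

The integer minimizer stays within one unit of the continuous minimizer $(2kn)^{1/(2k+1)}$, which has the order $n^{1/(2k+1)}$ stated in the corollary.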

Let $B$ represent the Banach space defined as the closure of $\mathcal{F}_0$ for the
$L^2$-norm

$B = \{ f \in L^2[0, 2\pi] \text{ such that } c_0(f) = 0 \}.$

Here, the studied sequence of parameter $\nu_n$ is not $(\theta^*, a^*, v^*)$ any more, but
it is the truncated Fourier series of $f^*$:

$\nu_n(P^{(n)}_{\alpha^*,f^*}) = \sum_{|l| \le m_n} c_l(f^*) e^{il(\cdot)}.$

The parameter sequence $\nu_n$ is differentiable:

Thus, there exists a continuous linear map $\dot\nu$ from $\mathcal{H}$ onto $B$. To have a
representation of the derivative $\dot\nu$, we consider the dual space $B^*$ of $B$. In
other words, for $b^* \in B^*$, $b^* \dot\nu$ is represented by $\dot\nu^{b^*} \in \mathcal{H}$:

$\forall h \in \mathcal{H} \quad b^* \dot\nu(h) = \langle \dot\nu^{b^*}, h \rangle = b^* h_f.$

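Furthermore, the dual space $B^*$ is generated by the following linear real
functions:

$b^*_{1l} : f \in \mathcal{F}_0 \mapsto \int_0^{2\pi} f(t) \cos(lt) \frac{dt}{2\pi}$ and
$b^*_{2l} : f \in \mathcal{F}_0 \mapsto \int_0^{2\pi} f(t) \sin(lt) \frac{dt}{2\pi}, \quad l \in \mathbb{Z}^*.$

Thus it suffices to know $\dot\nu^{b^*_{1l}}$ and $\dot\nu^{b^*_{2l}}$ for all $l \in \mathbb{Z}^*$ in order to
determine all $\{\dot\nu^{b^*}, b^* \in B^*\}$. After straightforward computations, these vectors
are

$\dot\nu^{b^*_{1l}} = (0, \cos(l\,\cdot)/J)$ and $\dot\nu^{b^*_{2l}} = (0, \sin(l\,\cdot)/J).$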

The estimator of the common shape is asymptotically linear. This means
that for all $b^* \in B^*$ there exists $h^{b^*} \in \mathcal{H}$ such that

(3.3) $\sqrt{n}\, b^* \left( T_n - \nu_n(P^{(n)}_{\alpha^*,f^*}) \right) = \Delta_n(h^{b^*}) + o_P(1).$

Since $\{b^*_{1l}, b^*_{2l}, l \in \mathbb{Z}^*\}$ generates the dual space of $B$, Lemma 4.4 ensures the
asymptotic linearity of $\hat f_n$.

Now, we discuss the regularity and the efficiency of this estimator. We 
deduce from Proposition 5.4 of McNeney and Wellner [10] that: 

Corollary 3.2. $b^* \hat f_n$ is a regular efficient estimator of $b^* f^*$ for all
$b^* \in B^*$ if and only if the function $f^*$ is odd or even. In particular, in this
case, the estimator of the Fourier coefficients of $f^*$ is efficient.

Consequently, $\hat f_n$ is eventually regular and efficient if the common shape
$f^*$ is odd or even. But the fluctuations $\sqrt{n}(T_n - \nu_n(P^{(n)}_{\alpha_n(h), f_n(h)}))$ do not
converge weakly under $P^{(n)}_{\alpha_n(h), f_n(h)}$ to a tight limit in $B$ for each $\{\alpha_n(h), f_n(h)\}$
[e.g., take $h = (0, 0)$]. Thus, even if $f^*$ is odd or even, $\hat f_n$ is not efficient.

Remark 3.2. The model where the function $f^*$ is assumed to be odd
or even has been studied by Dalalyan, Golubev and Tsybakov [1]. In this
model, the identifiability constraint "$\theta_1 = 0$" is not necessary: The shift
parameters are defined from the symmetric point 0. Thus the estimators of
$\theta_1^*, \dots, \theta_J^*$ would be asymptotically independent. Moreover the estimation
method would be adaptive.

4. The proofs. 

4.1. Proof of Theorem 2.1. 

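Remark 4.1. Let us introduce some notation. First the deterministic
part of $\hat c_l$ (2.5) is equal to

(4.1) $\frac{1}{nJ} \sum_{j=1}^J \sum_{i=1}^n a_j \left( f_j^*(t_i) - v_j \right) e^{-il(t_i - \theta_j)} = c_l(f^*) \phi(l\theta - l\theta^*, a) + g_n^l(\beta),$

where $g_n^l(\beta) = \sum_{|p| > m_n,\ p - l \in n\mathbb{Z}} c_p(f^*) \phi(l\theta - p\theta^*, a)$ and
$\phi(\theta, a) = \frac{1}{J} \sum_{j=1}^J a_j a_j^* e^{i\theta_j}$.

Since assumption (2.3) holds, the term $g_n^l$ is bounded by

(4.2) $|g_n^l(\beta)| \le \sum_{2|p| > n} |c_p(f^*)|.$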

For $j = 1, \dots, J$ and $|l| \le m_n$, let us denote $w_{j,l} = \xi_{j,l}/\sqrt{n}$.
Then the variables $\xi_{j,l}$ are independent standard complex Gaussian variables
from Remark 2.1. Thus the stochastic part of $\hat c_l$ is equal to



(43) ^*)=^Ev"'& -th | W) |< ^Efoi- 

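Lemma 4.1 (The uniform convergence in probability). Under the
assumptions of Theorem 2.1, we have

$\sup_{\beta \in \mathcal{A}_0} |M_n(\beta) - M(\beta) - \sigma^{*2}| = o_P(1),$

where $M(\beta) = M^1(\beta) + M^2(\beta)$,

$M^1(\beta) = \sum_{l \in \mathbb{Z}} |c_l(f^*)|^2 \left( 1 - |\phi(l\theta - l\theta^*, a)|^2 \right)$ and $M^2(\beta) = \frac{1}{J} \sum_{j=1}^J (v_j - v_j^*)^2.$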
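
The limit contrast $M = M^1 + M^2$ can be evaluated explicitly once the Fourier coefficients of $f^*$ are given. The sketch below is illustrative and rests on an assumption: it takes $\phi(\theta, a) = \frac{1}{J} \sum_j a_j a_j^* e^{i\theta_j}$, the form obtained by expanding the deterministic part of (2.5). The function names and numerical values are hypothetical.

```python
import cmath
import math

def phi(theta, a, a_star):
    """Assumed form: phi(theta, a) = (1/J) * sum_j a_j * a_j^* * exp(i*theta_j)."""
    J = len(a)
    return sum(aj * asj * cmath.exp(1j * th)
               for aj, asj, th in zip(a, a_star, theta)) / J

def contrast(theta, a, v, theta_star, a_star, v_star, coefs):
    """M(beta) = M^1(beta) + M^2(beta); coefs maps l != 0 to c_l(f^*)."""
    J = len(a)
    M1 = 0.0
    for l, c in coefs.items():
        arg = [l * (th - ths) for th, ths in zip(theta, theta_star)]
        M1 += abs(c) ** 2 * (1 - abs(phi(arg, a, a_star)) ** 2)
    M2 = sum((vj - vsj) ** 2 for vj, vsj in zip(v, v_star)) / J
    return M1 + M2

# Hypothetical shape f*(t) = cos(t) + 0.5*sin(2t) and true parameter (J = 2).
coefs = {1: 0.5, -1: 0.5, 2: -0.25j, -2: 0.25j}
theta_s = [0.0, 0.8]
a_s = [0.75, math.sqrt(2 - 0.75 ** 2)]
v_s = [2.5, 0.5]

at_truth = contrast(theta_s, a_s, v_s, theta_s, a_s, v_s, coefs)
shifted = contrast([0.0, 0.3], a_s, v_s, theta_s, a_s, v_s, coefs)
level_off = contrast(theta_s, a_s, [3.5, 1.5], theta_s, a_s, v_s, coefs)
```

The contrast vanishes at $\beta^*$ because $\phi(0, a^*) = \frac{1}{J} \sum_j a_j^{*2} = 1$ under the constraint, and it is positive as soon as a shift or a level is wrong.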



Proof. The contrast process may be rewritten as the sum of three terms:
$M_n(\beta) = D_n(\beta) + \sigma^* L_n(\beta) + \sigma^{*2} Q_n(\beta).$
The term $D_n(\beta) = D_n^1(\beta) - D_n^2(\beta)$ is the deterministic part where



J f n 
.7=1 I i=l 



^(/3)= E 

l<\l\<m r 



I — p 



n 



<f>(lO-p0*,a) 



The term $L_n(\beta) = L_n^1(\beta) - L_n^2(\beta)$ is the linear part with noise, where



J n 



lltj 



j=l i=l 



l<|/|<m n Uez \ n J 
The term $Q_n(\beta) = Q_n^1(\beta) - Q_n^2(\beta)$ is the quadratic part with noise:

J n 

Q 1 M = ^T,T,<i and = - E igg*)I 2 - 

3=1 i=l l<|Z[<m„ 

From the weak law of large numbers, $Q_n^1$, which does not depend on $\beta$,
converges in probability to 1. Furthermore, $Q_n^2$ is bounded by

$0 \le Q_n^2(\beta) \le Q^*$ where $nJQ^* = \sum_{|l| \le m_n} \sum_{j=1}^{J} |\xi_{j,l}|^2$.

Then assumption (2.7) induces that $\sup_{\beta \in A_0} |Q_n(\beta) - 1|$ converges to 0 in
probability.
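
The bound on the quadratic term can be checked with a short expectation computation. The sketch below is illustrative only: it assumes the convention $E|\xi_{j,l}|^2 = 1$ for a standard complex Gaussian variable, that the sum defining $Q^*$ runs over the $2m_n + 1$ frequencies $|l| \le m_n$, and that $m_n/n \to 0$ (taken here as a sufficient condition, not as the exact content of assumption (2.7)).

```latex
% Sketch under the stated assumptions: E|\xi_{j,l}|^2 = 1 for all j and l.
% Requires amsmath for align* and \xrightarrow.
\begin{align*}
E[Q^*] &= \frac{1}{nJ}\sum_{|l|\le m_n}\sum_{j=1}^{J} E|\xi_{j,l}|^2
        = \frac{(2m_n+1)\,J}{nJ} = \frac{2m_n+1}{n}, \\
P(Q^* > \varepsilon) &\le \frac{E[Q^*]}{\varepsilon}
        = \frac{2m_n+1}{n\,\varepsilon} \xrightarrow[n\to\infty]{} 0
        \quad\text{whenever } m_n/n \to 0, \\
\sup_{\beta}|Q_n(\beta) - 1| &\le |Q_n^1 - 1| + Q^* .
\end{align*}
```

The second line is the Markov inequality; the third uses $Q_n = Q_n^1 - Q_n^2$ with $0 \le Q_n^2 \le Q^*$, so both terms on the right vanish in probability.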

Using the fact that $f^*$ is continuous and that $|v_j| \le v_{\max}$, there exists a
constant $c > 0$ such that for all $\beta \in A_0$ we have



<cl} n B where = i- 

nJ 



J n 

j=i i=i 



Then we deduce that $L_n^1$ converges uniformly in probability to 0. Concerning
the term $L_n^2$, it may be written as the sum of two variables $L_n^{21}$ and $L_n^{22}$:

Kl<\l\<m n ) 
v^L 22 (/3) = 2k| &(P)W)\- 

il<\l\<m n J 

Due to assumption (2.7), $\sqrt{n}\,L_n^{21}(\cdot)$ is bounded by the following variable,
which is tight:

2 y E N(/*)lXXvl- 

l<|«|<m„ i=l 

Thus, $L_n^{21}$ converges uniformly in probability to 0. Similarly, $L_n^{22}$ is bounded
by

^ b 4(em/i) E Ei^i- 

\|2p|>n / |i|<mni=l 

Consequently, from assumption (2.7), $L_n^{22}$ converges uniformly in probability
to 0. Therefore, $L_n$ converges uniformly in probability to 0.

It remains to prove that $D_n$ converges uniformly to $M$. First, it is easy to
prove that $D_n^1$ converges to $D^1$ and $D_n^2$ converges to $D^2$, where

$D^2(\beta) = \sum_{l \in \mathbb{Z}} |c_l(f^*)\,\phi(l\theta - l\theta^*, a)|^2$.



Consequently, $D_n$ converges pointwise to $M = D^1 - D^2$. We now prove that
the convergence is uniform. For all $\beta \in A_0$, we have

n \ 

<*(/;)--£/•(*) h 

i=l ) 



\D 2 -D 2 m< y, h(t)\ 2 + \D. 



2B 



\l\>m n 



where D™ = 2 E - W\ a)g l n (P)} + E \& 

l<\l\<m l<\l\<m 

Using the Cauchy-Schwarz inequality and inequality (4.2), we have that 



\Dl B (P)\<2j2 h(f*)\ E \cp(f*)\+2m n 

\l\<m \p\>m n 



E m/*) 



\p\>m n 



Assumption (2.7) ensures the uniform convergence of $D_n^{2B}$. Consequently,
since $f^*$ is continuous, we deduce the uniform convergence of $D_n^1$ and $D_n^2$.
□



Lemma 4.2 (Uniqueness of minimum). $M$ has a unique minimum, reached
at the point $\beta = \beta^*$.

Proof. First, $M^2$ and $M^1$ are nonnegative functions and we have that
$M(\beta^*) = 0$. Consequently, the minimum of $M$ is reached at $\beta = (\theta, a, v) \in A_0$
if and only if $M^1(\beta) = M^2(\beta) = 0$.

But if $M^2$ is equal to 0, this implies that $v = v^*$.

Furthermore, using the Cauchy-Schwarz inequality, we have for all $l \in \mathbb{Z}^*$
that $|\phi(l\theta, a)| \le 1$. Since there exists $l \in \mathbb{Z}^*$ such that $c_l(f^*) \ne 0$ ($f^*$ is
not constant), $M^1$ is equal to 0 if and only if the vectors $(a_j^*)_{j=1,\ldots,J}$ and
$(a_j e^{il(\theta_j - \theta_j^*)})_{j=1,\ldots,J}$ are proportional for such $l$. From the identifiability
constraints on the model, we deduce that

$a = a^*$ and $\forall l \in \mathbb{Z}$ such that $|c_l(f^*)| \ne 0$, $l(\theta^* - \theta) \equiv 0 \ (2\pi)$.

Thus it suffices that $c_1(f^*) \ne 0$, or that there exist two relatively prime integers
$l, k$ such that $c_l(f^*) \ne 0$, $c_k(f^*) \ne 0$, in order that $\theta = \theta^*$. In other words, $2\pi$
is the minimal period of the function $f^*$. In conclusion, $M^1(\beta)$ is equal to
zero if and only if $a = a^*$ and $\theta = \theta^*$. □
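
The Cauchy-Schwarz step in this proof can be written out explicitly. The sketch below assumes that $\phi(\theta, a) = \frac{1}{J}\sum_{j=1}^{J} a_j a_j^* e^{i\theta_j}$ and that both $a$ and $a^*$ satisfy the normalization $\sum_{j=1}^{J} a_j^2 = J$; these are assumptions about definitions and identifiability constraints stated earlier in the paper.

```latex
% Sketch under the stated assumptions on \phi and on the normalization of a, a^*.
\begin{align*}
|\phi(l\theta - l\theta^*, a)|
  &= \frac{1}{J}\Bigl|\sum_{j=1}^{J} a_j^*\, a_j e^{il(\theta_j-\theta_j^*)}\Bigr| \\
  &\le \frac{1}{J}\Bigl(\sum_{j=1}^{J} a_j^{*2}\Bigr)^{1/2}
       \Bigl(\sum_{j=1}^{J} \bigl|a_j e^{il(\theta_j-\theta_j^*)}\bigr|^2\Bigr)^{1/2}
   = \frac{1}{J}\,\sqrt{J}\,\sqrt{J} = 1 .
\end{align*}
```

Equality in the Cauchy-Schwarz inequality holds exactly when the two vectors are proportional, which is the condition used in the proof to characterize the zeros of $M^1$.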



4.2. Proof of Proposition 3.1. The proof is divided into two parts. First,
we prove that $\langle\cdot,\cdot\rangle_{\mathcal{H}}$ is an inner product. Next, we have to choose suitable
points $(\alpha_n(h), f_n(h))$ in order to establish the LAN property.

$\langle\cdot,\cdot\rangle_{\mathcal{H}}$ is an inner product in $\mathcal{H}$. The form $\langle\cdot,\cdot\rangle_{\mathcal{H}}$ is bilinear, symmetric
and positive. In order to be an inner product, the form $\langle\cdot,\cdot\rangle_{\mathcal{H}}$ has to be
definite. In other words, if $h \in \mathcal{H}$ is such that $\|h\|_{\mathcal{H}} = 0$, we want to prove
that $h = 0$. Let $h$ be such a vector; then we have that $h_\sigma = 0$ and for all
$j = 2, \ldots, J$,



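(4.4)    $\|a_j^* h_f + h_{a,j} f^* - h_{\theta,j}\, a_j^*\, \partial f^* + h_{v,j}\|_{L^2} = 0$ and $\|a_1^* h_f - (\rho/a_1^*) f^* + h_{v,1}\|_{L^2} = 0$,

where $\rho = \sum_{k=2}^{J} h_{a,k}\, a_k^*$. Since the functions $h_f$, $f^*$ and $\partial f^*$ are orthogonal
to 1 in $L^2[0, 2\pi]$, we deduce that $h_{v,j} = 0$ for all $j$. Moreover, the functions
$h_f$ and $f^*$ are continuous and the equation (4.4) implies that $a_1^{*2} h_f = \rho f^*$
and that for all $j = 2, \ldots, J$ ($f^*$ and $\partial f^*$ are orthogonal),

$\|(a_j^* \rho / a_1^{*2} + h_{a,j}) f^*\|_{L^2} = 0$ and $\|h_{\theta,j}\, a_j^*\, \partial f^*\|_{L^2} = 0$.

Since $f^*$ is not constant, we deduce that for all $j = 2, \ldots, J$, $h_{\theta,j} = 0$
and $a_j^* \rho / a_1^{*2} + h_{a,j} = 0$. Consequently, $\rho$ verifies the equation
$\rho \sum_{j=2}^{J} a_j^{*2} / a_1^{*2} + \rho = 0$. Then $\rho$ is equal to zero and $h = 0$.
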


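The LAN property. Let $h$ be in $\mathcal{H}$. In order to satisfy the identifiability
constraints of the model, we choose the sequences $(\alpha_n(h), f_n(h))$ [with
$\alpha_n(h) = ((\theta_n^{(j)}(h))_{1 \le j \le J}, (a_n^{(j)}(h))_{1 \le j \le J}, (v_n^{(j)}(h))_{1 \le j \le J}, \sigma_n(h))$] such that

$\theta_n^{(j)}(h) = \theta_j^* + \frac{1}{\sqrt{n}} h_{\theta,j}$ and $a_n^{(j)}(h) = a_j^* + \frac{1}{\sqrt{n}} h_{a,j}$  $\forall j = 2, \ldots, J$,

$a_n^{(1)}(h) = \sqrt{J - \sum_{j=2}^{J} a_n^{(j)}(h)^2}$ and $\sigma_n(h) = \sigma^* + \frac{1}{\sqrt{n}} h_\sigma$,

$f_n(h) = f_n = f^* + \frac{1}{\sqrt{n}} h_f$ and $v_n^{(j)}(h) = v_j^* + \frac{1}{\sqrt{n}} h_{v,j}$  $\forall j = 1, \ldots, J$.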

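Using the uniform continuity of $\partial f^*$ and $h_f$, we uniformly establish for
$i = 1, \ldots, n$ that

$f_n(t_i - \theta_n^{(j)}(h)) - f_n(t_i - \theta_j^*) = -\frac{h_{\theta,j}}{\sqrt{n}}\, \partial f^*(t_i - \theta_j^*) + o(1/\sqrt{n})$  $\forall j = 1, \ldots, J$,

$a_n^{(1)}(h) - a_1^* = -\frac{1}{\sqrt{n}} \sum_{j=2}^{J} \frac{a_j^*}{a_1^*}\, h_{a,j} + o(1/\sqrt{n})$,

$\log\Bigl(1 + \frac{h_\sigma}{\sigma^* \sqrt{n}}\Bigr) = \frac{h_\sigma}{\sigma^* \sqrt{n}} - \frac{h_\sigma^2}{2 \sigma^{*2} n} + o(n^{-1})$.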
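
The expansions of the first scale parameter and of the logarithm are ordinary Taylor expansions. A brief sketch, assuming $a_n^{(1)}(h) = \sqrt{J - \sum_{j \ge 2} a_n^{(j)}(h)^2}$, $a_n^{(j)}(h) = a_j^* + h_{a,j}/\sqrt{n}$ for $j \ge 2$, $\sigma_n(h) = \sigma^* + h_\sigma/\sqrt{n}$, $\sum_{j=1}^{J} a_j^{*2} = J$ and $a_1^* > 0$:

```latex
% Sketch under the stated assumptions; uses sqrt(1+x) = 1 + x/2 + O(x^2)
% and log(1+x) = x - x^2/2 + o(x^2) as x -> 0.
\begin{align*}
a_n^{(1)}(h)^2 &= J - \sum_{j=2}^{J}\Bigl(a_j^* + \tfrac{h_{a,j}}{\sqrt n}\Bigr)^2
   = a_1^{*2} - \frac{2}{\sqrt n}\sum_{j=2}^{J} a_j^* h_{a,j} + O(1/n), \\
a_n^{(1)}(h) &= a_1^*\Bigl(1 - \frac{2}{a_1^{*2}\sqrt n}\sum_{j=2}^{J} a_j^* h_{a,j} + O(1/n)\Bigr)^{1/2}
   = a_1^* - \frac{1}{\sqrt n}\sum_{j=2}^{J}\frac{a_j^*}{a_1^*}\,h_{a,j} + O(1/n), \\
\log\frac{\sigma_n(h)}{\sigma^*} &= \log\Bigl(1 + \frac{h_\sigma}{\sigma^*\sqrt n}\Bigr)
   = \frac{h_\sigma}{\sigma^*\sqrt n} - \frac{h_\sigma^2}{2\sigma^{*2} n} + o(n^{-1}).
\end{align*}
```

The second-order term of the logarithm is kept because the log-likelihood ratio sums of order $n$ such terms, so it contributes to the limit.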



Then, with the notation of the proposition, we may deduce that 

i n J j 2 

A n (a n (h),f n (h))=A n (h) - ^EE^iW 2 " 2 ^ + ° p(1) - 

i=i j=i CJ 

Sr=i S/=i A™j{h) 2 /n is a Riemann sum which converges to More- 
over, from the Lindeberg-Feller central limit theorem (see [12], Chapter 2) 
A n (/i) converges in distribution to A/"(0, 

4.3. The efficient estimation of $\theta^*$, $a^*$ and $v^*$.

Lemma 4.3 (The derivative of u). The representant of the v n 's deriva- 
tive is = ((z>J) 2 <j<j, (i>jh<j<J, )i<j<J G % , u^ere 

*2 

^ = T^-(°' " J ' °> °> °) / 0T 3 = 2 ' ' ' ' ' J ' 

11/ ll L 2 

z>J = (0,0,e„0,0) forj = l,...,J, 

where the vector $e_j$ is the $j$th vector of the canonical basis of $\mathbb{R}^J$, and the vectors
@ J = ("k)k=2,...,J an d °j = ( a i)k=2,...,j a- r z defined as 

* \ 1/af + l/af, ifk = j, 



a k~\ 1 _ „*2 



a*a* k /J, ifk^j, 



1-af/J, ifk = j. 
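Proof. For $h \in \mathcal{H}$ and $h' \in \mathcal{H}$, we may rewrite the inner product of the
tangent space under the following form:

$\sigma^{*2} \langle h, h' \rangle = J h_\sigma h'_\sigma + \langle h_f, J h'_f - \lambda\, \partial f^* \rangle$

    $+ \sum_{k=2}^{J} h_{\theta,k} \langle \partial f^*, -a_k^{*2} h'_f + h'_{\theta,k}\, a_k^{*2}\, \partial f^* \rangle$

    $+ \sum_{k=2}^{J} h_{a,k} \langle f^*, h'_{a,k} f^* + \frac{a_k^*}{a_1^{*2}}\, \rho f^* \rangle + \sum_{k=1}^{J} h_{v,k} h'_{v,k}$,

where $\lambda = \sum_{k=2}^{J} h'_{\theta,k}\, a_k^{*2}$ and $\rho = \sum_{k=2}^{J} h'_{a,k}\, a_k^*$. Let $k \in \{2, \ldots, J\}$ be a fixed
integer; we want to find $h'$ such that for all $h \in \mathcal{H}$, $\langle h, h' \rangle = h_{\theta,k}$. Consequently,
such $h'$ verifies these equations:

(4.5)    $h'_f = \lambda\, \partial f^* / J$, $h'_\sigma = 0$ and $h'_{v,j} = 0$, $\forall j = 1, \ldots, J$,

(4.6)    $(h'_{a,j} + \rho\, a_j^* / a_1^{*2}) \|f^*\|^2 = 0$, $\forall j = 2, \ldots, J$,

(4.7)    $(-\lambda / J + h'_{\theta,j})\, a_j^{*2} \|\partial f^*\|^2 = \sigma^{*2}$ if $j = k$, and $= 0$ if $j \ne k$.

Combining equations (4.6) and (4.7), we have that

$\lambda\, a_1^{*2} \|\partial f^*\|^2 / J = \sigma^{*2}$ and $\rho J \|f^*\|^2 / a_1^{*2} = 0$.

Thus we deduce that $\rho = 0$ and $\lambda = J \sigma^{*2} / (a_1^{*2} \|\partial f^*\|^2)$. Consequently, $h'$ is
equal to $\dot\nu_k^\theta$.

We likewise solve the equation $\langle h, h' \rangle = h_{a,k}$. Finally, we have that $\|f^*\|^2 \rho =
\sigma^{*2} a_1^{*2} a_k^* / J$ and $\lambda = 0$. Hence the solution is $h' = \dot\nu_k^a$. □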
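
The step that combines (4.6) and (4.7) can be made explicit. The sketch below reads (4.6) as $(h'_{a,j} + \rho\, a_j^*/a_1^{*2})\|f^*\|^2 = 0$ and (4.7) as $(-\lambda/J + h'_{\theta,j})\, a_j^{*2}\|\partial f^*\|^2 = \sigma^{*2}\mathbf{1}\{j = k\}$ for $j = 2, \ldots, J$, and assumes $\lambda = \sum_{j \ge 2} a_j^{*2} h'_{\theta,j}$, $\rho = \sum_{j \ge 2} a_j^* h'_{a,j}$ and $\sum_{j=1}^{J} a_j^{*2} = J$. Summing (4.7) over $j$, and multiplying (4.6) by $a_j^*$ before summing, gives:

```latex
% Sketch under the stated reading of (4.6)-(4.7) and the normalization of a^*.
\begin{align*}
\sigma^{*2} &= \sum_{j=2}^{J}\Bigl(-\frac{\lambda}{J} + h'_{\theta,j}\Bigr)a_j^{*2}\,\|\partial f^*\|^2
   = \Bigl(-\frac{\lambda\,(J - a_1^{*2})}{J} + \lambda\Bigr)\|\partial f^*\|^2
   = \frac{\lambda\, a_1^{*2}\,\|\partial f^*\|^2}{J}, \\
0 &= \sum_{j=2}^{J} a_j^*\Bigl(h'_{a,j} + \frac{\rho\, a_j^*}{a_1^{*2}}\Bigr)\|f^*\|^2
   = \Bigl(\rho + \frac{\rho\,(J - a_1^{*2})}{a_1^{*2}}\Bigr)\|f^*\|^2
   = \frac{\rho\, J\, \|f^*\|^2}{a_1^{*2}} .
\end{align*}
```

Solving the first line for $\lambda$ and the second for $\rho$ yields the values quoted in the proof; the same two sums, with the right-hand sides exchanged, handle the equation for $h_{a,k}$.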

Proposition 4.1. Under the assumptions and notation of Theorem 3.1, 
we have that 

la* 

V^VM n (/r) = -— G n + op(l) where l G n = \G e n ,G a n ,G v n ). 

$G_n$ is a Gaussian vector which converges in distribution to $\mathcal{N}_{3J-2}(0, H)$ and
is defined as



G 



—y 

i n 

G ™ = 777?E 



n A 2 



i=i 

n 



J 



at 



-D + jA 2 54 



dF*(ti)ei,, 



Ij- 



F*{U)£i,, 



G^ = —=y^£i and t e i = t {e it i, . . . ,Si j) for % = 1, . . . ,n. 
Proof. In order to prove that proposition, we proceed in two steps.
First, using the notation of Proposition 4.1, we show that

v^VM n ,(/3*) = M^LKn ~ Lf((3*)) = ~~J~ ( -*n + op(1). 

Finally, we prove that $(G_n^\theta, G_n^a, G_n^v)$ is a Gaussian vector which converges
to $\mathcal{N}_{3J-2}(0, H)$.

First, we study separately the gradients of $D_n$, $L_n$ and $Q_n$. Let $k \in \{2, \ldots, J\}$
be fixed. The partial derivative with respect to the variable $\theta_k$ is

l<|J|<m„ V 7 

It is bounded by 

T 

2 



—dQ n . t 



<^ E I'liewiEi^i- 

V l<|Z|<m„ j=l 



Thus \fn^ [L {f3*) converges in probability to if m^/n = o(l). Similarly, the 
partial derivative with respect to the variable converges in probability to 
0, too. Consequently, \/nVQ n ((3*) converges to in probability. 

Concerning the deterministic part, the partial derivative with respect to
$\theta_k$ is



k l<\l\<m n I pel ^ 



Using the inequality (4.2), it is bounded by 



Ec P (/*)^(^)^((p-/)r,a*) 



^7^*) = ^t^I 2 E E w 



\l\<rn n 2|p|>n 



+ E I'lf E MDl) }■ 

\l\<m„ \2|p|>n / ) 



\l 

Consequently, we deduce from the assumptions of the theorem that $\sqrt{n}\, \frac{\partial D_n}{\partial \theta_k}(\beta^*)$
converges in probability to 0. In like manner, $\sqrt{n}\, \frac{\partial D_n}{\partial a_k}(\beta^*)$ converges in
probability to 0, too. For the partial derivative with respect to $v_k$, we
have

n 



a=l / p£nZ* 

Thus from assumption (3.2), we deduce that $\sqrt{n}\, \partial D_n / \partial v_k$ at $\beta^*$ converges
to 0. Finally, $\sqrt{n}\, \nabla D_n(\beta^*)$ converges to 0 in probability.

Therefore, we have that y/nVM n ((3*) = ^/nVL n (/3*) + o P (l). With the 
notation of Lemma 4.1, we have 

f)j22 9 f 1 

v^M/n = j E R^c-e^i+ajscs)) E <*(/*)[■ 

k l<|i|<m n I 2|p|>n J 

p—lGnZ 

The centered Gaussian variable $\sqrt{n}\, \frac{\partial L_n^{22}}{\partial \theta_k}(\beta^*)$ has a variance bounded by



( e Mm) 

\2|p|>n / 



; |Pl> 

From assumption (3.2), we conclude that $\sqrt{n}\, \frac{\partial L_n^{22}}{\partial \theta_k}(\beta^*)$ converges to 0 in
probability. In like manner, $\sqrt{n}\, \frac{\partial L_n^{22}}{\partial a_k}(\beta^*)$ converges in probability to 0, too.
Thus we have that ^nVM n (ft*) = 1ftiVL l n (/3*) - V^VL 21 (/3*) + o P (l). After 
straightforward computations, we obtain 

v^f(/n=-^ E K{z q (r)(4a^)-4^e^)}, 

l<|i|<m n 

*%«n~% £ »{,(/-)(e-.^-| s )}. 

1 ^ 1 1 1 ^ 7TI ji 

r dM n 2a* A 2a* — 

We can now define (G^,G^,G^) as 

= E ^{^(r)(yA 2 :- J D + lA 2 'A]xA+o P (l), 

K|l|<m„ ^ ■ 



G a n = E 4 Q(r) (^ A i /,/ - 1 ) Xr i + 0p(1) and 



l<|/|<m n 

G v n = M{X* } + o F (l 
where X* denote the independent identically distributed complex Gaussian
vectors defined as 

Since G 6 n and do not depend on X£, G v n is independent of G e n and G^- 
Moreover, its variance matrix is equal to the identity matrix of R J . Further- 
more, the imaginary part and the real part of ci(f*)X* are independent. 
Consequently, G e n and G% are asymptotically independent with covariance 
matrix || 2 x (D 2 - A 2 A 2 / J) and ||/*|| 2 (7/_i - A 54/aJ 2 ), respectively. 

By the definition of $(\xi_{k,l})$ (Remark 4.1), we deduce from assumption (3.1)
that for a fixed $k = 1, \ldots, J$,



l|i|<m« J * i=l l|f|<m* 



i/ Q (/*)e 



*"\p^(*i- e fe) 



1 ™ 
V 7 ' i=l 



Thus, $(G_n^\theta, G_n^a, G_n^v)$ are equal to the expressions defined in the proposition.

□ 

Proposition 4.2. Under the assumptions and notation of Theorem 3.1, 
we have 

V 2 M n (p n ) % ~H. 

n— >oo J* 

Proof. The matrix —2H/J 2 is the value of the Hessian matrix of M 
at the point $\beta^*$. We study locally the Hessian matrix of $M_n$. Consequently, we
may assume that the sequences $(\bar\beta_n)$ are in the following set:

$A^{loc} = \{(\theta, a, v) \in A_0,\ a_1 > r \text{ and } \|\beta - \beta^*\| \le \|\hat\beta_n - \beta^*\|\}$,
where $a_1^* > r > 0$. Notice that for $\varepsilon > 0$, we have

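$P\bigl(\sup_{\beta \in A^{loc}} \|\nabla^2 M_n(\beta) - \nabla^2 M(\beta^*)\| > 2\varepsilon\bigr)$
    $\le P\bigl(\sup_{\beta \in A^{loc}} \|\nabla^2 M_n(\beta) - \nabla^2 M(\beta)\| > \varepsilon\bigr)$

    $+ P\bigl(\sup_{\beta \in A^{loc}} \|\nabla^2 M(\beta) - \nabla^2 M(\beta^*)\| > \varepsilon\bigr)$.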

As in Lemma 4.1, assumptions (3.1) and (3.2) assure the uniform convergence
in probability of $\nabla^2 M_n$ to the Hessian matrix of $M$ on $A^{loc}$. Thus,
the first term of the inequality converges to 0 with $n$.

Since $\nabla^2 M$ is continuous at $\beta^*$, there exists $\delta > 0$ such that
$\nabla^2 M(B(\beta^*, \delta)) \subset B(\nabla^2 M(\beta^*), \varepsilon)$.
Consequently, we have the following inclusion of events:

$\bigl\{\sup_{\beta \in A^{loc}} \|\nabla^2 M(\beta) - \nabla^2 M(\beta^*)\| > \varepsilon\bigr\} \subset \bigl\{\|\hat\beta_n - \beta^*\| \ge \delta\bigr\}$.

Thus, from Theorem 2.1, the second term of the inequality converges to 0,
too. □
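
The two-term bound used in this proof is the triangle inequality followed by a union bound. A minimal sketch, where the suprema are taken over the local set of the proof and $\|\cdot\|$ is any matrix norm:

```latex
% Sketch: triangle inequality, then "a sum exceeds 2*eps only if one term exceeds eps".
\begin{align*}
\|\nabla^2 M_n(\beta) - \nabla^2 M(\beta^*)\|
  &\le \|\nabla^2 M_n(\beta) - \nabla^2 M(\beta)\|
     + \|\nabla^2 M(\beta) - \nabla^2 M(\beta^*)\|, \\
\Bigl\{\sup_{\beta}\|\nabla^2 M_n(\beta) - \nabla^2 M(\beta^*)\| > 2\varepsilon\Bigr\}
  &\subset \Bigl\{\sup_{\beta}\|\nabla^2 M_n(\beta) - \nabla^2 M(\beta)\| > \varepsilon\Bigr\}
   \cup \Bigl\{\sup_{\beta}\|\nabla^2 M(\beta) - \nabla^2 M(\beta^*)\| > \varepsilon\Bigr\}.
\end{align*}
```

Taking probabilities and using subadditivity gives the inequality; the first event is controlled by uniform convergence and the second by the consistency of the estimator together with the continuity of $\nabla^2 M$ at $\beta^*$.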

4.3.1. The estimation of the common shape. 

Remark 4.2. If the assumptions of Theorem 3.1 hold, we obtain using 
the Cauchy-Schwarz inequality that 

1/2 ( x 1/2 

\ici(n\n 2 \ 

■ \l\>n ) v|Z|>r 

Similarly, if $f^*$ is $k$ times differentiable and $f^{*(k)}$ is square integrable, we
have

$\sum_{|l| > n} |c_l(f^*)| = o(n^{-k+1/2})$ and $\sum_{|l| > n} |c_l(f^*)|^2 = o(n^{-2k})$.
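
Both rates follow from a weighted Cauchy-Schwarz argument. The sketch below assumes $k \ge 1$ and that the Fourier coefficients of the $k$th derivative are $(il)^k c_l(f^*)$, so that $\sum_{l} |l|^{2k} |c_l(f^*)|^2 < \infty$ by Parseval's identity; the tail of this convergent series is therefore $o(1)$.

```latex
% Sketch under the stated assumptions (k >= 1, f^{*(k)} square integrable).
\begin{align*}
\sum_{|l|>n} |c_l(f^*)|
  &= \sum_{|l|>n} |l|^{-k}\,|l|^{k}\,|c_l(f^*)|
   \le \Bigl(\sum_{|l|>n} |l|^{-2k}\Bigr)^{1/2}
       \Bigl(\sum_{|l|>n} |l|^{2k}|c_l(f^*)|^2\Bigr)^{1/2} \\
  &= O(n^{-k+1/2})\cdot o(1) = o(n^{-k+1/2}), \\
\sum_{|l|>n} |c_l(f^*)|^2
  &\le n^{-2k}\sum_{|l|>n} |l|^{2k}|c_l(f^*)|^2 = o(n^{-2k}).
\end{align*}
```

The first factor is bounded by comparison with an integral: $\sum_{|l|>n} |l|^{-2k} \le 2\int_n^\infty x^{-2k}\,dx = 2 n^{1-2k}/(2k-1)$.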

Proof of Corollary 3.1. Using the notation of Lemma 4.1, we have
for all $t \in \mathbb{R}$,

= e <*(/V*+ E e* e ^cn 

|£|>m n l<|i|<m n |2p| >n,p— iGnZ 

(4.8) + ^ q(/*)W(0-n,a)-lK W 

l<|/|<m,, 



Z|>n lU|>n J l|Z|>n 



l<|Z|<m„ V 



Since Theorem 3.1 holds and using the delta method, we have for all
$j = 1, \ldots, J$,

Moreover, we have 

m § _ 9 *),a) - 1| < j^a*\ aj - a*\ + i f>f l^'"^ " 1|- 
Then, we deduce that

$\sup_{t \in \mathbb{R}} |(4.8)| = O_P(1/\sqrt{n})$ and $E\|(4.8)\|_{L^2}^2 = O(1/n)$.

Using (4.3) and the Cauchy-Schwarz inequality, we have 



E 6 <«^ 



l<|i|<m 



1 



l<K|<m„ J'=l 



f E <^ E Efe 

U K|l|<m„J=l 



3= 

Hence we deduce by the Markov inequality that 

f 2w alt 

W n = P (m n /Vn) and / EW*— = 0{m n / 
Then, using Remark 4.1, the corollary follows. □

Lemma 4.4. Let $l$ be in $\mathbb{Z}^*$. For $n$ large enough, we have
y/nM(ci0 n ) - = A n (-$l(lci(f*)h f ) + A n ^0, 0, 0, 0, 

+ o P (l), 

V^9(Q(/3n)-Q(r)) = A„(9(/ Q (r)/ i / ) + A n ^0,0,0,0 

+ o P (l), 

^^ = ^(^,0,0,0,^5/*). 

Proof. Let $l$ be in $\mathbb{Z}^*$. For $n$ large enough (such that $|l| \le m_n$), from
the continuous mapping theorem ([12], Theorem 2.3), and since assumption
(3.1) ensures that $c_l^n(f^*)$ converges to $c_l(f^*)$ with speed $\sqrt{n}$, we obtain

vW/n " f*) = Mh0n) ~ Cl(f*)), 



sin(/- 

J 



1 + £j(0*,a*) + <*(!). 



3=1 



Since $\sqrt{n}(\hat\theta_n - \theta^*, \hat a_n - a^*)$ converges in distribution (Theorem 3.1), we use
the delta method ([12], Chapter 3):



vW/n - /*) = E -yVnA - D + + 



3=2 
Thus from Theorem 3.1 and Lemma 4.3 and due to the linearity of $\Delta_n(\cdot)$,
we have